Comparison of Classification and Ranking Approaches to Pronominal Anaphora Resolution in Czech

نویسندگان

  • Giang Linh Nguy
  • Václav Novák
  • Zdenek Zabokrtský
چکیده

In this paper we compare two Machine Learning approaches to the task of pronominal anaphora resolution: a conventional classification system based on C5.0 decision trees, and a novel perceptron-based ranker. We use coreference links annotated in the Prague Dependency Treebank 2.0 for training and evaluation purposes. The perceptron system achieves f-score 79.43% on recognizing coreference of personal and possessive pronouns, which clearly outperforms the classifier and which is the best result reported on this data set so far.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Pronominal Reference Type Identification and Event Anaphora Resolution for Hindi

In this paper, we present hybrid approaches for pronominal reference type (abstract or concrete) identification and event anaphora resolution for Hindi. Pronominal reference type identification is one of the important parts for any anaphora resolution system as it helps anaphora resolver in optimal feature selection based on pronominal reference types. We use language specific rules and feature...

متن کامل

Application of Pronominal Divergence and Anaphora Resolution in English-Hindi Machine Translation

So far the majority of Machine Translation (MT) research has focused on translation at the level of individual sentences. For sentence level translation, Machine Translation has addressed various divergence issues for large variety of languages; the issue of pronominal divergence has been presented only recently. Since the quality of translation as required by users follows coherent multi-sente...

متن کامل

CoreferenCe Chains in CzeCh, english and russian: Preliminary findings1

This paper is a pilot comparative study on coreference chaining in three languages, namely, Czech, English and Russian. We have analyzed 16 parallel English-Czech newspaper texts and 16 texts in Russian (similar to the English-Czech ones in length and topics). Our motivation was to find out what the linguistic structure of coreference chains in different languages is and what types of distincti...

متن کامل

Conditional Random Fields based Pronominal Resolution in Tamil

This paper deals with Tamil pronominal resolution using Conditional Random Fields a machine learning approach. A detailed linguistic analysis of Tamil pronominals and its antecedence occurring in various syntactic constructs is done, which led to the selection of appropriate features for CRF approach. The syntactic features thus identified made the system learn most frequently occurring pronoun...

متن کامل

A Mention-Ranking Model for Abstract Anaphora Resolution

Resolving abstract anaphora is an important, but difficult task for text understanding. Yet, with recent advances in representation learning this task becomes a more tangible aim. A central property of abstract anaphora is that it establishes a relation between the anaphor embedded in the anaphoric sentence and its (typically non-nominal) antecedent. We propose a mention-ranking model that lear...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009